{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "b65d23f5-44b7-49b7-aa09-3f59fe7c5960",
   "metadata": {},
   "source": [
    "# Pandas缺失值处理"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "edecf8cc-1e96-4730-b3b4-53d713908faf",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "433efe09-ad09-458e-88f4-88a0d8fb7b8d",
   "metadata": {},
   "source": [
    "丢失数据(空值)分为：\n",
    "- None\n",
    "- np.nan"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "05c134f9-c91a-4ca3-91a9-02cd45b451aa",
   "metadata": {},
   "source": [
    "## 1.None"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8168d0aa-5091-4763-9eac-256c97e8cd22",
   "metadata": {},
   "source": [
    "- None是Python自带的，是Python中的空对象。None不能参与任何计算中。\n",
    "- object类型的运算要比int类型的运算要慢的多"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "f6476531-54cf-4dda-a551-fbaa3bb93486",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "45.8 ms ± 2.38 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
     ]
    }
   ],
   "source": [
    "%timeit np.arange(1e6,dtype=object).sum()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "2e9a6f72-f8c6-406f-bfed-07f7238e136d",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1.81 ms ± 100 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)\n"
     ]
    }
   ],
   "source": [
    "%timeit np.arange(1e6,dtype=np.int32).sum()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ba0a89c7-1a14-48e6-8ea9-98d75cd11b8e",
   "metadata": {},
   "source": [
    "## 2.np.nan\n",
    "- np.nan是浮点类型，能参与到计算中。但计算结果是NaN"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "d93f2e37-0c81-4d4c-a67c-70170bba8805",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "float"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "type(np.nan)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0450d093-a4de-4963-933f-6f8b8e236ed7",
   "metadata": {},
   "source": [
    "- 但可以使用np.nan*()函数来计算nan,此时会过滤掉nan"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "cb5c2ca3-a4cc-4efe-a705-0cb738882c7f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 1.,  2.,  3., nan,  5.,  6.])"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "n = np.array([1,2,3,np.nan,5,6])\n",
    "n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "c4823a9a-3b1d-46d9-80a4-ef93b51d7e66",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "np.float64(17.0)"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.sum(n) # 直接计算nan会返回nan\n",
    "\n",
    "np.nansum(n)# 通过nansum过滤nan"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "68b77f6c-bd79-4faa-94c1-eeb7b49e4e34",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "nan"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.nan+10"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dd2f9f5c-b7f3-4407-b703-a6baa0e40f04",
   "metadata": {},
   "source": [
    "## 3.Pandas中的None和NaN"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "464fcb40-cb0d-4b3b-9f63-aa5ea87e54f2",
   "metadata": {},
   "source": [
    "**1)Pandas中的None与np.nan都视作np.nan**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "id": "c7adfcbf-c741-4d38-804e-c665c0de8665",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>A</th>\n",
       "      <th>B</th>\n",
       "      <th>C</th>\n",
       "      <th>D</th>\n",
       "      <th>E</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>98</td>\n",
       "      <td>28</td>\n",
       "      <td>24</td>\n",
       "      <td>40</td>\n",
       "      <td>27</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>64</td>\n",
       "      <td>89</td>\n",
       "      <td>15</td>\n",
       "      <td>9</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5</td>\n",
       "      <td>75</td>\n",
       "      <td>89</td>\n",
       "      <td>18</td>\n",
       "      <td>80</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>94</td>\n",
       "      <td>1</td>\n",
       "      <td>59</td>\n",
       "      <td>48</td>\n",
       "      <td>73</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>23</td>\n",
       "      <td>12</td>\n",
       "      <td>31</td>\n",
       "      <td>85</td>\n",
       "      <td>48</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    A   B   C   D   E\n",
       "0  98  28  24  40  27\n",
       "1  64  89  15   9   8\n",
       "2   5  75  89  18  80\n",
       "3  94   1  59  48  73\n",
       "4  23  12  31  85  48"
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data = np.random.randint(0,100,size=(5,5))\n",
    "\n",
    "df=pd.DataFrame(data=data,columns=list('ABCDE'))\n",
    "\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "id": "482b1a14-960d-44b7-ac61-c6cedb69c6c3",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>A</th>\n",
       "      <th>B</th>\n",
       "      <th>C</th>\n",
       "      <th>D</th>\n",
       "      <th>E</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>98</td>\n",
       "      <td>28.0</td>\n",
       "      <td>24.0</td>\n",
       "      <td>40</td>\n",
       "      <td>27</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>64</td>\n",
       "      <td>89.0</td>\n",
       "      <td>15.0</td>\n",
       "      <td>9</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5</td>\n",
       "      <td>NaN</td>\n",
       "      <td>89.0</td>\n",
       "      <td>18</td>\n",
       "      <td>80</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>94</td>\n",
       "      <td>1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>48</td>\n",
       "      <td>73</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>23</td>\n",
       "      <td>12.0</td>\n",
       "      <td>31.0</td>\n",
       "      <td>85</td>\n",
       "      <td>48</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    A     B     C   D   E\n",
       "0  98  28.0  24.0  40  27\n",
       "1  64  89.0  15.0   9   8\n",
       "2   5   NaN  89.0  18  80\n",
       "3  94   1.0   NaN  48  73\n",
       "4  23  12.0  31.0  85  48"
      ]
     },
     "execution_count": 60,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[2,'B']=np.nan\n",
    "df.loc[3,'C']=None\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "745c2d63-2705-4d52-9c1d-d090835132e7",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(np.float64(nan), np.float64(nan))"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.loc[2,'B'],df.loc[3,'C'] # (np.float64(nan), np.float64(nan))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7ae5e35b-fae5-4006-bc9c-284b1f2b94aa",
   "metadata": {},
   "source": [
    "**2)Pandas中的None和np.nan的操作**"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6939f723-7360-4049-b8ff-ee433fd15104",
   "metadata": {},
   "source": [
    "- isnull\n",
    "- notnull\n",
    "- all\n",
    "- any\n",
    "- dropna():过滤丢失数据\n",
    "- fillna():填充丢失数据"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9b8c202c-e0d8-41ab-aa9d-0d075eec2112",
   "metadata": {},
   "source": [
    "（1）判断函数\n",
    "- isnull()\n",
    "- notnull()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "d21c03cd-fa23-4a12-9075-0efa4c8f07c6",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>A</th>\n",
       "      <th>B</th>\n",
       "      <th>C</th>\n",
       "      <th>D</th>\n",
       "      <th>E</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>66</td>\n",
       "      <td>16.0</td>\n",
       "      <td>55.0</td>\n",
       "      <td>16</td>\n",
       "      <td>28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>15</td>\n",
       "      <td>40.0</td>\n",
       "      <td>67.0</td>\n",
       "      <td>75</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>38</td>\n",
       "      <td>NaN</td>\n",
       "      <td>90.0</td>\n",
       "      <td>0</td>\n",
       "      <td>44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>68</td>\n",
       "      <td>81.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>44</td>\n",
       "      <td>85</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>84</td>\n",
       "      <td>79.0</td>\n",
       "      <td>48.0</td>\n",
       "      <td>6</td>\n",
       "      <td>56</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    A     B     C   D   E\n",
       "0  66  16.0  55.0  16  28\n",
       "1  15  40.0  67.0  75   6\n",
       "2  38   NaN  90.0   0  44\n",
       "3  68  81.0   NaN  44  85\n",
       "4  84  79.0  48.0   6  56"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "c0e12cd8-a597-4518-84e9-aab1acc68999",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>A</th>\n",
       "      <th>B</th>\n",
       "      <th>C</th>\n",
       "      <th>D</th>\n",
       "      <th>E</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>False</td>\n",
       "      <td>True</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>True</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       A      B      C      D      E\n",
       "0  False  False  False  False  False\n",
       "1  False  False  False  False  False\n",
       "2  False   True  False  False  False\n",
       "3  False  False   True  False  False\n",
       "4  False  False  False  False  False"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.isnull()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "ca17906e-2657-4d98-b215-f468da7d15a6",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>A</th>\n",
       "      <th>B</th>\n",
       "      <th>C</th>\n",
       "      <th>D</th>\n",
       "      <th>E</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>True</td>\n",
       "      <td>True</td>\n",
       "      <td>True</td>\n",
       "      <td>True</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>True</td>\n",
       "      <td>True</td>\n",
       "      <td>True</td>\n",
       "      <td>True</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>True</td>\n",
       "      <td>False</td>\n",
       "      <td>True</td>\n",
       "      <td>True</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>True</td>\n",
       "      <td>True</td>\n",
       "      <td>False</td>\n",
       "      <td>True</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>True</td>\n",
       "      <td>True</td>\n",
       "      <td>True</td>\n",
       "      <td>True</td>\n",
       "      <td>True</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      A      B      C     D     E\n",
       "0  True   True   True  True  True\n",
       "1  True   True   True  True  True\n",
       "2  True  False   True  True  True\n",
       "3  True   True  False  True  True\n",
       "4  True   True   True  True  True"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.notnull()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "261f721d-0979-41ea-809d-1caf8a7c85b2",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0     True\n",
       "1     True\n",
       "2    False\n",
       "3    False\n",
       "4     True\n",
       "dtype: bool"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# all() : 必须全部为True才为True\n",
    "# any() : 只要有一为True就为True\n",
    "\n",
    "# 列\n",
    "df.isnull().any() # 判断每列的bool值,尽可能找的有空的列\n",
    "df.isnull().all() # 判断每列的bool值,必须全部都为空的列\n",
    "\n",
    "df.notnull().any()\n",
    "df.notnull().all()\n",
    "\n",
    "\n",
    "# 行\n",
    "df.isnull().any(axis=1)# 判断每行的bool值,尽可能找的有空的行\n",
    "df.isnull().all(axis=1)# 判断每行的bool值,必须全部都为空的行\n",
    "\n",
    "df.notnull().any(axis=1)\n",
    "df.notnull().all(axis=1)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "be4de06f-52f3-43d1-b899-636d82826b61",
   "metadata": {},
   "source": [
    "- 使用bool值索引过滤数据"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "id": "31ec9341-e000-456a-aa16-589ae0080fb5",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>A</th>\n",
       "      <th>B</th>\n",
       "      <th>C</th>\n",
       "      <th>D</th>\n",
       "      <th>E</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>66</td>\n",
       "      <td>16.0</td>\n",
       "      <td>55.0</td>\n",
       "      <td>16</td>\n",
       "      <td>28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>15</td>\n",
       "      <td>40.0</td>\n",
       "      <td>67.0</td>\n",
       "      <td>75</td>\n",
       "      <td>6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>38</td>\n",
       "      <td>NaN</td>\n",
       "      <td>90.0</td>\n",
       "      <td>0</td>\n",
       "      <td>44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>68</td>\n",
       "      <td>81.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>44</td>\n",
       "      <td>85</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>84</td>\n",
       "      <td>79.0</td>\n",
       "      <td>48.0</td>\n",
       "      <td>6</td>\n",
       "      <td>56</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    A     B     C   D   E\n",
       "0  66  16.0  55.0  16  28\n",
       "1  15  40.0  67.0  75   6\n",
       "2  38   NaN  90.0   0  44\n",
       "3  68  81.0   NaN  44  85\n",
       "4  84  79.0  48.0   6  56"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 过滤数据\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "id": "59e715dd-0cc2-431b-b9d8-ca2ea1ff8f64",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0    True\n",
       "1    True\n",
       "2    True\n",
       "3    True\n",
       "4    True\n",
       "dtype: bool"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>A</th>\n",
       "      <th>B</th>\n",
       "      <th>C</th>\n",
       "      <th>D</th>\n",
       "      <th>E</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>98</td>\n",
       "      <td>28</td>\n",
       "      <td>24</td>\n",
       "      <td>40</td>\n",
       "      <td>27</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>64</td>\n",
       "      <td>89</td>\n",
       "      <td>15</td>\n",
       "      <td>9</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5</td>\n",
       "      <td>75</td>\n",
       "      <td>89</td>\n",
       "      <td>18</td>\n",
       "      <td>80</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>94</td>\n",
       "      <td>1</td>\n",
       "      <td>59</td>\n",
       "      <td>48</td>\n",
       "      <td>73</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>23</td>\n",
       "      <td>12</td>\n",
       "      <td>31</td>\n",
       "      <td>85</td>\n",
       "      <td>48</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    A   B   C   D   E\n",
       "0  98  28  24  40  27\n",
       "1  64  89  15   9   8\n",
       "2   5  75  89  18  80\n",
       "3  94   1  59  48  73\n",
       "4  23  12  31  85  48"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 行过滤\n",
    "cond=df.isnull().any(axis=1)\n",
    "display(~cond)# ~ 取反\n",
    "df[~cond]\n",
    "\n",
    "cond = df.notnull().all(axis=1)\n",
    "df[cond]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "id": "0cd63784-22aa-4d97-885c-841d18a35412",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>A</th>\n",
       "      <th>B</th>\n",
       "      <th>C</th>\n",
       "      <th>D</th>\n",
       "      <th>E</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>98</td>\n",
       "      <td>28</td>\n",
       "      <td>24</td>\n",
       "      <td>40</td>\n",
       "      <td>27</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>64</td>\n",
       "      <td>89</td>\n",
       "      <td>15</td>\n",
       "      <td>9</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5</td>\n",
       "      <td>75</td>\n",
       "      <td>89</td>\n",
       "      <td>18</td>\n",
       "      <td>80</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>94</td>\n",
       "      <td>1</td>\n",
       "      <td>59</td>\n",
       "      <td>48</td>\n",
       "      <td>73</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>23</td>\n",
       "      <td>12</td>\n",
       "      <td>31</td>\n",
       "      <td>85</td>\n",
       "      <td>48</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    A   B   C   D   E\n",
       "0  98  28  24  40  27\n",
       "1  64  89  15   9   8\n",
       "2   5  75  89  18  80\n",
       "3  94   1  59  48  73\n",
       "4  23  12  31  85  48"
      ]
     },
     "execution_count": 56,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 列过滤\n",
    "cond = df.isnull().any()\n",
    "df.loc[:,~cond]\n",
    "\n",
    "cond = df.notnull().all()\n",
    "df.loc[:,cond]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c33c83db-af23-400f-8d85-2bfddb592111",
   "metadata": {},
   "source": [
    "(2)过滤函数\n",
    "- dropna()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f52b2a1e-9979-41a9-b35d-ccd416245b27",
   "metadata": {},
   "source": [
    "可以选择过滤的是行还是列（默认行）"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "id": "3a47ab59-41cf-429e-b9ce-e4ebae2ba943",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>A</th>\n",
       "      <th>B</th>\n",
       "      <th>C</th>\n",
       "      <th>D</th>\n",
       "      <th>E</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>98</td>\n",
       "      <td>28</td>\n",
       "      <td>24</td>\n",
       "      <td>40</td>\n",
       "      <td>27</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>64</td>\n",
       "      <td>89</td>\n",
       "      <td>15</td>\n",
       "      <td>9</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5</td>\n",
       "      <td>75</td>\n",
       "      <td>89</td>\n",
       "      <td>18</td>\n",
       "      <td>80</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>94</td>\n",
       "      <td>1</td>\n",
       "      <td>59</td>\n",
       "      <td>48</td>\n",
       "      <td>73</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>23</td>\n",
       "      <td>12</td>\n",
       "      <td>31</td>\n",
       "      <td>85</td>\n",
       "      <td>48</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    A   B   C   D   E\n",
       "0  98  28  24  40  27\n",
       "1  64  89  15   9   8\n",
       "2   5  75  89  18  80\n",
       "3  94   1  59  48  73\n",
       "4  23  12  31  85  48"
      ]
     },
     "execution_count": 57,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "id": "8111696e-1572-4a10-8008-77ae66dacd18",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>A</th>\n",
       "      <th>D</th>\n",
       "      <th>E</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>98</td>\n",
       "      <td>40</td>\n",
       "      <td>27</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>64</td>\n",
       "      <td>9</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5</td>\n",
       "      <td>18</td>\n",
       "      <td>80</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>94</td>\n",
       "      <td>48</td>\n",
       "      <td>73</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>23</td>\n",
       "      <td>85</td>\n",
       "      <td>48</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    A   D   E\n",
       "0  98  40  27\n",
       "1  64   9   8\n",
       "2   5  18  80\n",
       "3  94  48  73\n",
       "4  23  85  48"
      ]
     },
     "execution_count": 62,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 默认删除有空的行\n",
    "df.dropna()\n",
    "\n",
    "# 删除有空的列\n",
    "df.dropna(axis=1) # axis=1：删除有空的列"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "32fb9a05-c1dd-4422-8a82-d3ba05f8247f",
   "metadata": {},
   "source": [
    "也可选择过滤的方式how='all'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "id": "c02c928e-e601-4a6c-b289-5aa12fa654d7",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>A</th>\n",
       "      <th>B</th>\n",
       "      <th>C</th>\n",
       "      <th>D</th>\n",
       "      <th>E</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>98</td>\n",
       "      <td>28.0</td>\n",
       "      <td>24.0</td>\n",
       "      <td>40</td>\n",
       "      <td>27</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>64</td>\n",
       "      <td>89.0</td>\n",
       "      <td>15.0</td>\n",
       "      <td>9</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5</td>\n",
       "      <td>NaN</td>\n",
       "      <td>89.0</td>\n",
       "      <td>18</td>\n",
       "      <td>80</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>94</td>\n",
       "      <td>1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>48</td>\n",
       "      <td>73</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>23</td>\n",
       "      <td>12.0</td>\n",
       "      <td>31.0</td>\n",
       "      <td>85</td>\n",
       "      <td>48</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    A     B     C   D   E\n",
       "0  98  28.0  24.0  40  27\n",
       "1  64  89.0  15.0   9   8\n",
       "2   5   NaN  89.0  18  80\n",
       "3  94   1.0   NaN  48  73\n",
       "4  23  12.0  31.0  85  48"
      ]
     },
     "execution_count": 63,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 默认为any\n",
    "# 一行的所有数据都为nan才会删除\n",
    "df.dropna(how='all')\n",
    "\n",
    "df.dropna(how='all',axis=1)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aacedc55-73bd-4fb2-9617-64e239ff8810",
   "metadata": {},
   "source": [
    "inplace=True 修改原数据"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 65,
   "id": "889719f7-5c75-4308-b924-060cf8bdb602",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>A</th>\n",
       "      <th>B</th>\n",
       "      <th>C</th>\n",
       "      <th>D</th>\n",
       "      <th>E</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>98</td>\n",
       "      <td>28.0</td>\n",
       "      <td>24.0</td>\n",
       "      <td>40</td>\n",
       "      <td>27</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>64</td>\n",
       "      <td>89.0</td>\n",
       "      <td>15.0</td>\n",
       "      <td>9</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5</td>\n",
       "      <td>NaN</td>\n",
       "      <td>89.0</td>\n",
       "      <td>18</td>\n",
       "      <td>80</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>94</td>\n",
       "      <td>1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>48</td>\n",
       "      <td>73</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>23</td>\n",
       "      <td>12.0</td>\n",
       "      <td>31.0</td>\n",
       "      <td>85</td>\n",
       "      <td>48</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    A     B     C   D   E\n",
       "0  98  28.0  24.0  40  27\n",
       "1  64  89.0  15.0   9   8\n",
       "2   5   NaN  89.0  18  80\n",
       "3  94   1.0   NaN  48  73\n",
       "4  23  12.0  31.0  85  48"
      ]
     },
     "execution_count": 65,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2=df.copy()\n",
    "df2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "id": "9e47a7f3-cd02-41af-adfc-c78b785afe3f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>A</th>\n",
       "      <th>B</th>\n",
       "      <th>C</th>\n",
       "      <th>D</th>\n",
       "      <th>E</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>98</td>\n",
       "      <td>28.0</td>\n",
       "      <td>24.0</td>\n",
       "      <td>40</td>\n",
       "      <td>27</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>64</td>\n",
       "      <td>89.0</td>\n",
       "      <td>15.0</td>\n",
       "      <td>9</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>23</td>\n",
       "      <td>12.0</td>\n",
       "      <td>31.0</td>\n",
       "      <td>85</td>\n",
       "      <td>48</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    A     B     C   D   E\n",
       "0  98  28.0  24.0  40  27\n",
       "1  64  89.0  15.0   9   8\n",
       "4  23  12.0  31.0  85  48"
      ]
     },
     "execution_count": 66,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.dropna(inplace=True)\n",
    "df2"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "717d5a3c-b8af-4bc9-a50a-01fe1c04b9d0",
   "metadata": {},
   "source": [
    "（3）填充函数Series/DataFrame"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "84f27d45-f893-4271-b9f1-2447abe43972",
   "metadata": {},
   "source": [
    "- fillna()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "id": "78db088a-c51e-4a73-9bae-f5f7e3c4d7a9",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>A</th>\n",
       "      <th>B</th>\n",
       "      <th>C</th>\n",
       "      <th>D</th>\n",
       "      <th>E</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>98</td>\n",
       "      <td>28.0</td>\n",
       "      <td>24.0</td>\n",
       "      <td>40</td>\n",
       "      <td>27</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>64</td>\n",
       "      <td>89.0</td>\n",
       "      <td>15.0</td>\n",
       "      <td>9</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5</td>\n",
       "      <td>NaN</td>\n",
       "      <td>89.0</td>\n",
       "      <td>18</td>\n",
       "      <td>80</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>94</td>\n",
       "      <td>1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>48</td>\n",
       "      <td>73</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>23</td>\n",
       "      <td>12.0</td>\n",
       "      <td>31.0</td>\n",
       "      <td>85</td>\n",
       "      <td>48</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    A     B     C   D   E\n",
       "0  98  28.0  24.0  40  27\n",
       "1  64  89.0  15.0   9   8\n",
       "2   5   NaN  89.0  18  80\n",
       "3  94   1.0   NaN  48  73\n",
       "4  23  12.0  31.0  85  48"
      ]
     },
     "execution_count": 68,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "id": "9439c56a-ae14-4624-874e-1d9e722eef19",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>A</th>\n",
       "      <th>B</th>\n",
       "      <th>C</th>\n",
       "      <th>D</th>\n",
       "      <th>E</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>98</td>\n",
       "      <td>28.0</td>\n",
       "      <td>24.0</td>\n",
       "      <td>40</td>\n",
       "      <td>27</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>64</td>\n",
       "      <td>89.0</td>\n",
       "      <td>15.0</td>\n",
       "      <td>9</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5</td>\n",
       "      <td>100.0</td>\n",
       "      <td>89.0</td>\n",
       "      <td>18</td>\n",
       "      <td>80</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>94</td>\n",
       "      <td>1.0</td>\n",
       "      <td>100.0</td>\n",
       "      <td>48</td>\n",
       "      <td>73</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>23</td>\n",
       "      <td>12.0</td>\n",
       "      <td>31.0</td>\n",
       "      <td>85</td>\n",
       "      <td>48</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    A      B      C   D   E\n",
       "0  98   28.0   24.0  40  27\n",
       "1  64   89.0   15.0   9   8\n",
       "2   5  100.0   89.0  18  80\n",
       "3  94    1.0  100.0  48  73\n",
       "4  23   12.0   31.0  85  48"
      ]
     },
     "execution_count": 70,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 填充nan\n",
    "df.fillna(value=100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "id": "a9150e4f-6443-4712-adde-896298be7115",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>A</th>\n",
       "      <th>B</th>\n",
       "      <th>C</th>\n",
       "      <th>D</th>\n",
       "      <th>E</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>98</td>\n",
       "      <td>28.0</td>\n",
       "      <td>24.0</td>\n",
       "      <td>40</td>\n",
       "      <td>27</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>64</td>\n",
       "      <td>NaN</td>\n",
       "      <td>15.0</td>\n",
       "      <td>9</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>18</td>\n",
       "      <td>80</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>94</td>\n",
       "      <td>1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>48</td>\n",
       "      <td>73</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>23</td>\n",
       "      <td>12.0</td>\n",
       "      <td>31.0</td>\n",
       "      <td>85</td>\n",
       "      <td>48</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    A     B     C   D   E\n",
       "0  98  28.0  24.0  40  27\n",
       "1  64   NaN  15.0   9   8\n",
       "2   5   NaN   NaN  18  80\n",
       "3  94   1.0   NaN  48  73\n",
       "4  23  12.0  31.0  85  48"
      ]
     },
     "execution_count": 71,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2=df.copy()\n",
    "df2.loc[1,'B']=np.nan\n",
    "df2.loc[2,'C']=np.nan\n",
    "df2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 76,
   "id": "8af5c8cb-d34d-49b1-864c-27b1312acb61",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>A</th>\n",
       "      <th>B</th>\n",
       "      <th>C</th>\n",
       "      <th>D</th>\n",
       "      <th>E</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>98</td>\n",
       "      <td>28.0</td>\n",
       "      <td>24.0</td>\n",
       "      <td>40</td>\n",
       "      <td>27</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>64</td>\n",
       "      <td>100.0</td>\n",
       "      <td>15.0</td>\n",
       "      <td>9</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5</td>\n",
       "      <td>100.0</td>\n",
       "      <td>100.0</td>\n",
       "      <td>18</td>\n",
       "      <td>80</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>94</td>\n",
       "      <td>1.0</td>\n",
       "      <td>100.0</td>\n",
       "      <td>48</td>\n",
       "      <td>73</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>23</td>\n",
       "      <td>12.0</td>\n",
       "      <td>31.0</td>\n",
       "      <td>85</td>\n",
       "      <td>48</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    A      B      C   D   E\n",
       "0  98   28.0   24.0  40  27\n",
       "1  64  100.0   15.0   9   8\n",
       "2   5  100.0  100.0  18  80\n",
       "3  94    1.0  100.0  48  73\n",
       "4  23   12.0   31.0  85  48"
      ]
     },
     "execution_count": 76,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# limit限制填充次数\n",
    "df2.fillna(value=100,limit=1,inplace=True)\n",
    "df2"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "993befcf-3e3d-4be6-a559-29c674dfc7e3",
   "metadata": {},
   "source": [
    "可以选择前向填充还是后向填充"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 77,
   "id": "9981483e-ed74-4a25-be94-8059592abf03",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>A</th>\n",
       "      <th>B</th>\n",
       "      <th>C</th>\n",
       "      <th>D</th>\n",
       "      <th>E</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>98</td>\n",
       "      <td>28.0</td>\n",
       "      <td>24.0</td>\n",
       "      <td>40</td>\n",
       "      <td>27</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>64</td>\n",
       "      <td>89.0</td>\n",
       "      <td>15.0</td>\n",
       "      <td>9</td>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5</td>\n",
       "      <td>NaN</td>\n",
       "      <td>89.0</td>\n",
       "      <td>18</td>\n",
       "      <td>80</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>94</td>\n",
       "      <td>1.0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>48</td>\n",
       "      <td>73</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>23</td>\n",
       "      <td>12.0</td>\n",
       "      <td>31.0</td>\n",
       "      <td>85</td>\n",
       "      <td>48</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    A     B     C   D   E\n",
       "0  98  28.0  24.0  40  27\n",
       "1  64  89.0  15.0   9   8\n",
       "2   5   NaN  89.0  18  80\n",
       "3  94   1.0   NaN  48  73\n",
       "4  23  12.0  31.0  85  48"
      ]
     },
     "execution_count": 77,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 80,
   "id": "e9098181-8225-4fb2-83c2-45dcce0b0d38",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "C:\\Users\\21232\\AppData\\Local\\Temp\\ipykernel_20672\\3499483326.py:1: FutureWarning: DataFrame.fillna with 'method' is deprecated and will raise in a future version. Use obj.ffill() or obj.bfill() instead.\n",
      "  df.fillna(method='ffill') # 向前填充\n",
      "C:\\Users\\21232\\AppData\\Local\\Temp\\ipykernel_20672\\3499483326.py:2: FutureWarning: DataFrame.fillna with 'method' is deprecated and will raise in a future version. Use obj.ffill() or obj.bfill() instead.\n",
      "  df.fillna(method='backfill')# 相下填充\n",
      "C:\\Users\\21232\\AppData\\Local\\Temp\\ipykernel_20672\\3499483326.py:3: FutureWarning: DataFrame.fillna with 'method' is deprecated and will raise in a future version. Use obj.ffill() or obj.bfill() instead.\n",
      "  df.fillna(method='ffill',axis=1)# 相左填充\n",
      "C:\\Users\\21232\\AppData\\Local\\Temp\\ipykernel_20672\\3499483326.py:4: FutureWarning: DataFrame.fillna with 'method' is deprecated and will raise in a future version. Use obj.ffill() or obj.bfill() instead.\n",
      "  df.fillna(method='backfill',axis=1)# 相右填充\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>A</th>\n",
       "      <th>B</th>\n",
       "      <th>C</th>\n",
       "      <th>D</th>\n",
       "      <th>E</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>98.0</td>\n",
       "      <td>28.0</td>\n",
       "      <td>24.0</td>\n",
       "      <td>40.0</td>\n",
       "      <td>27.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>64.0</td>\n",
       "      <td>89.0</td>\n",
       "      <td>15.0</td>\n",
       "      <td>9.0</td>\n",
       "      <td>8.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>5.0</td>\n",
       "      <td>89.0</td>\n",
       "      <td>89.0</td>\n",
       "      <td>18.0</td>\n",
       "      <td>80.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>94.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>48.0</td>\n",
       "      <td>48.0</td>\n",
       "      <td>73.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>23.0</td>\n",
       "      <td>12.0</td>\n",
       "      <td>31.0</td>\n",
       "      <td>85.0</td>\n",
       "      <td>48.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      A     B     C     D     E\n",
       "0  98.0  28.0  24.0  40.0  27.0\n",
       "1  64.0  89.0  15.0   9.0   8.0\n",
       "2   5.0  89.0  89.0  18.0  80.0\n",
       "3  94.0   1.0  48.0  48.0  73.0\n",
       "4  23.0  12.0  31.0  85.0  48.0"
      ]
     },
     "execution_count": 80,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.fillna(method='ffill') # 向前填充\n",
    "df.fillna(method='backfill')# 相下填充\n",
    "df.fillna(method='ffill',axis=1)# 相左填充\n",
    "df.fillna(method='backfill',axis=1)# 相右填充"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
